PREFACE


This paper is an attempt at an elementary presentation 
of what the author considers to be the most applicable and 
interesting properties of generalized inverses of matrices. 

We shall pay particular attention to a discussion of those 
properties concerned with the pseudo-inverse of a rectangu- 
lar matrix. This structure was first studied by Moore, who 
used the name general reciprocal, and later it was studied 
by Penrose, who used the name generalized inverse. 

The present discussion is what I deem to be the most 
practical way to present this material with the expectation 
that some of the readers of this paper will not be very well 
acquainted with abstract mathematics. I should like to 
apologize if all credit for work is not properly placed and 
refer those interested readers to the list of references, 
where practically all the material presented here is discussed 
in detail. 

I wish to thank all the people who were involved in the 
preparation and presentation of this paper. In particular, I 
should like to express my gratitude to E. R. Lancaster, who 
proof-read the manuscript and made many valuable suggestions, 
and to C. A. Rohde, who was kind enough to let me read his
doctoral thesis (Ref. 25), and whose suggestions were of 
immeasurable aid to me in the compilation and presentation 
of this material. 
I. PRELIMINARIES


To begin, we shall recall some results and definitions
from matrix theory. If n and m are natural numbers, an n x m
matrix A will be a rectangular array of real numbers

    A = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix}

which will be denoted by [a_{ij}], or A_{ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ m.
The transpose of A, denoted by A', will mean the m x n matrix
whose (i,j)th entry is a_{ji}. The ith row of A is the 1 x m
matrix [a_{i1} ... a_{im}]. The jth column of A is the n x 1
matrix

    \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix}

Thus, the transpose A' of A is the matrix whose columns are
the rows of A. We note that for matrices A and B whose product
is defined, (AB)' = B'A', and for A and B whose sum is defined,
(A+B)' = A'+B'. An m-row vector will mean a 1 x m matrix,
and an m-column vector will mean an m x 1 matrix. When the
meaning is clear from the context, an m-row or column vector
will simply be called an m-vector, or vector. Note that if
x is an m-column vector, then x' is an m-row vector and vice
versa. In what follows, unless otherwise stated, we shall
always mean that x is a column vector, and we shall write x'
when we wish to refer to the corresponding row vector.
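The rule (AB)' = B'A' can be checked numerically on a small case. The following is a minimal sketch in plain Python; the helper names `transpose` and `matmul` and the sample matrices are illustrative assumptions, not part of the original paper.

```python
def transpose(M):
    # Rows of M become columns of the result.
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    # Entry (i, j) is the sum over k of A[i][k] * B[k][j].
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 3]]         # 3 x 2

lhs = transpose(matmul(A, B))               # (AB)'
rhs = matmul(transpose(B), transpose(A))    # B'A'
print(lhs == rhs)
```

Note that the order of the factors reverses: A'B' is a 3 x 3 matrix here and cannot equal the 2 x 2 matrix (AB)'.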
Let A, B be n x m matrices with A = [a_{ij}] and B = [b_{ij}].
We define the n x m matrix A+B by (A+B)_{ij} = [a_{ij} + b_{ij}]. If
A is n x m, B is m x p, we define the n x p matrix C = AB by

    c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}

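The n x m matrix 0 is the matrix all of whose entries are
zeros. The n x n matrix I_n is the n-identity matrix [δ_{ij}]
where

    \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}

Thus

    I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}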


The n x n matrix D is called a diagonal matrix, denoted by
D = diag {a_1, ..., a_n}, if the non-diagonal entries are
zeros, and d_{ii} = a_i, 1 ≤ i ≤ n.

Thus

    D = \operatorname{diag}\{a_1, \ldots, a_n\} = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & a_n \end{bmatrix}

The n x n matrix A is called symmetric if A = A', and
idempotent if A^2 = A.

Let A be an n x m matrix. An n x n matrix B such that
BA = A is called a left identity for A, while an m x m matrix
T such that AT = A is called a right identity for A. The n x n
matrix A is called invertible or non-singular if there is an
n x n matrix B such that AB = BA = I_n. When such a matrix B
exists, it is unique and denoted by A^{-1}. A^{-1} is called the
inverse of A. If the matrix A is not invertible, it is called
singular. A matrix P such that P^{-1} = P' is called an orthogonal,
or orthonormal, matrix.

We shall need the following result which we state without
proof (see Ref. 3).

Theorem 1.1: Let A be an n x n symmetric matrix. Then there
is an orthogonal matrix P such that

    PAP' = D = diag {λ_1, λ_2, ..., λ_r, 0, ..., 0}

where the λ_i are non-zero scalars which may or may not be
distinct.

The matrix R with s non-zero rows is said to be in row 
reduced echelon form if the following are satisfied: 

i) If a row is not all zero, its leading non-zero 
term is 1. 

ii) All of the non-zero rows are above the rows 
consisting only of zeros. 
iii) If the ith non-zero row has its leading non-zero
term in the jth column, then the leading non-
zero term in each row above row i appears in
a column to the left of column j, for 1 ≤ i ≤ s.
iv) If the leading non-zero term in the ith row
appears in the jth column, then the other
entries of the jth column are all zeros.

The matrix C is said to be in column reduced echelon form
if C' is in row reduced echelon form. We may sometimes speak
of R as being row reduced or in row reduced form.

It is well-known (see Ref. 17) that any n x m matrix A
can be put into a unique row reduced form by elementary row
operations. If R is this row reduced form of A, there is a
non-singular n x n matrix P such that PA = R.

Similarly, we may use elementary column operations to
obtain a column reduced form B of R. When this is done, it
is seen that there is a non-singular m x m matrix Q such that

    PAQ = B = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}

where the 0's stand for zero matrices of appropriate sizes and
r ≤ min {m,n}.

We shall occasionally have use for so-called partitioned
matrices. These are matrices whose entries are themselves
matrices. We observe that, assuming all operations are
defined, these matrices may be handled as though the entries
were scalars. When using partitioned matrices, we may use lines
to separate the submatrix entries. For example, if

    P = \begin{bmatrix} P_1 \\ P_2 \end{bmatrix}

where P_1 is an n_1 x n_2 matrix, P_2 is an n_3 x n_2 matrix, and
Q = [Q_1 | Q_2] where Q_1 is an n_2 x n_4 matrix, Q_2 is an n_2 x n_5
matrix, then PQ is an (n_1+n_3) x (n_4+n_5) matrix given by

    PQ = \begin{bmatrix} P_1 \\ P_2 \end{bmatrix} [\, Q_1 \mid Q_2 \,] = \begin{bmatrix} P_1 Q_1 & P_1 Q_2 \\ P_2 Q_1 & P_2 Q_2 \end{bmatrix}

If A is an n x n matrix, the trace of A is defined by

    \operatorname{tr} A = \sum_{i=1}^{n} a_{ii}

The trace has the following properties:

i) For matrices A, B such that AB and BA exist,
tr(AB) = tr(BA). Hence, tr(PAP^{-1}) = tr A,
where P is non-singular and PAP^{-1} exists.

ii) tr(A+B) = tr(A) + tr(B) where A+B exists.
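Property i) holds even when AB and BA have different sizes. A small sketch in plain Python illustrates this; the helper names and the sample matrices are assumptions made for the illustration.

```python
def matmul(A, B):
    # Entry (i, j) is the sum over k of A[i][k] * B[k][j].
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(M):
    # Sum of the diagonal entries of a square matrix.
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 3]]         # 3 x 2

tr_AB = trace(matmul(A, B))   # AB is 2 x 2
tr_BA = trace(matmul(B, A))   # BA is 3 x 3
print(tr_AB, tr_BA)
```

Here AB is 2 x 2 and BA is 3 x 3, yet the two traces agree.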

A finite set of vectors {x_1, ..., x_n} is called linearly
dependent if there are n real numbers c_1, ..., c_n, not all
zero, such that

    \sum_{i=1}^{n} c_i x_i = 0.

If x_1, ..., x_n are not linearly dependent, they are said to be
linearly independent.

A row or column vector space V is a collection of row or
column vectors such that
i) x, y ∈ V ⇒ x+y ∈ V, and
ii) x ∈ V and c a real number ⇒ cx ∈ V.

A subspace W of a vector space V is a subset of V which 
is itself a vector space. Thus, for example, the collection 
of all n-column vectors (n is fixed) whose first coordinate 
is zero is a subspace of the vector space consisting of all 
n-column vectors. 

If {x_1, ..., x_n} is a finite collection of vectors, a vector x
is said to be a linear combination of the vectors x_1, ..., x_n
if there are n real numbers c_1, ..., c_n such that

    x = \sum_{i=1}^{n} c_i x_i.

The collection V of all vectors which are
linear combinations of x_1, ..., x_n forms a vector space,
V = L {x_1, ..., x_n}, called the space spanned by the vectors
x_1, ..., x_n. A linearly independent spanning set of vectors
for a vector space V is called a basis for V. It can be
proved (Ref. 17) that every vector space has a basis and that
any two bases for the same vector space have the same number of
elements. In all of our discussions involving a vector space
V, we shall assume that there is a finite basis for V. We shall
say that V is finite dimensional and that the dimension of V,
dim V, is n, where n is the number of elements in a basis for V.

For example, let V be the set of all n-column vectors.
It is clear that the n-vectors e_j = [0, ..., 0, 1, 0, ..., 0]',
with the 1 in the jth position, j = 1, ..., n,
form a basis for V, and that this basis has exactly n elements.
Now we return to an n x m matrix A. The row (column)
space of A is the vector space spanned by the rows (columns)
of A. The dimension of the row (column) space of A is
called the row (column) rank of A. It can be shown that for
any matrix, row rank A = column rank A. This common number
is called the rank of A and denoted by rk A. This shows that
for any matrix A, rk A = rk A'.

If x and y are n-column vectors, the inner or dot product, 
(x,y), of x and y is the number a so that x'y = [a]. In this 
situation we may choose to identify the matrix [a] with the 
real number a itself and write x’y = a. 

The inner product has the following properties:

i) (x, y) = (y, x)

ii) (x, ay) = (ax, y) = a(x, y)

iii) (x_1 + x_2, y) = (x_1, y) + (x_2, y)

iv) (x, y_1 + y_2) = (x, y_1) + (x, y_2)

v) (x, x) ≥ 0, and (x, x) = 0 ⟺ x = 0.

Here x, x_1, x_2, y, y_1, y_2 are all vectors of the same
dimension, and a is a real number. Property i) is called
commutativity. Properties ii) - iv) characterize the inner
product as being a bilinear function. Property v) is called
positive definiteness. For a vector x, the norm or magnitude
of x is the number (x, x)^{1/2}. This is denoted by ||x|| and has
the following properties:

i) ||x|| ≥ 0 and ||x|| = 0 ⟺ x = 0

ii) ||ax|| = |a| ||x||

iii) ||x + y|| ≤ ||x|| + ||y||
Let A be an n x m matrix, y be a fixed n-vector. Consider the
numbers ||Ax-y|| as x varies through all m-vectors. We define the
unique number inf ||Ax-y|| to be the number α such that

i) α ≤ ||Ax-y|| for all m-vectors x.
ii) If β is any number so that β ≤ ||Ax-y|| for all
m-vectors x, then β ≤ α.

Two vectors x and y are called orthogonal if (x, y) = 0.
We write x ⊥ y. A vector x is orthogonal to a vector space V if
for all y ∈ V, x ⊥ y. We write x ⊥ V. Similarly, two vector spaces
V and W are orthogonal if, for all x ∈ V, y ∈ W, x ⊥ y.

We will need the following well-known results: 

Theorem 1.2. If A is an n x m matrix, B an m x p matrix, then
rk AB ≤ rk A, and rk AB ≤ rk B.

Proof: Let us first note that for any matrices A and B whose
product is defined, the rows of AB are in the row space of B, and
the columns of AB are in the column space of A.

Now, if W and V are vector spaces with W ⊆ V, then dim W ≤ dim V.
Hence,

    rk AB = row rk AB ≤ row rk B = rk B and

    rk AB = col rk AB ≤ col rk A = rk A. Q.E.D.

If A is an n x m matrix, the range of A, R(A), is the column
space of A. The null space of A, N(A), is the collection of m-vectors
x such that Ax = 0. It can be shown that

(1) dim R(A) + dim N(A) = m

Thus,

    rk A + dim N(A) = m

The reason for the name of R(A) is clear since for any m-vector
x, Ax ∈ column space of A.

We say that a vector space V is the direct sum of subspaces
W_1 and W_2, written V = W_1 ⊕ W_2, if every vector x ∈ V can be written
uniquely as

    x = x_1 + x_2 with x_1 ∈ W_1, x_2 ∈ W_2.

If W is a subspace of V, the set of vectors orthogonal to W is
a subspace of V called the orthogonal complement of W, denoted
by W^⊥. It can be proved that V = W ⊕ W^⊥. If a vector space V
is the direct sum of W_1 and W_2, then W_1 ∩ W_2 = {0}.

If the vector x ∈ V is written in its unique form x = x_1 + x_2
with x_1 ∈ W, x_2 ∈ W^⊥, then the vector x_1 is called the orthogonal
projection, or, more simply, the projection of x onto W. We write

    x_1 = proj (x; W).

Theorem 1.3. Let A be n x m. Then N(A'A) = N(A), and hence dim N(A'A) =
dim N(A).

Proof: We first show that N(A'A) ⊆ N(A). Let x ∈ N(A'A). Then

    A'Ax = 0 ⇒ x'A'Ax = 0 ⇒ (Ax, Ax) = 0 ⇒ Ax = 0 ⇒ x ∈ N(A).

Conversely, let Ax = 0; then A'Ax = 0, and hence x ∈ N(A'A). Q.E.D.

Corollary I.1. If A is n x m, then rk A = rk A'A = rk AA'.

Proof: By (1),

    m = rk(A'A) + dim N(A'A) = rk A + dim N(A)

and from Theorem 1.3, rk A'A = rk A.
Now interchanging A and A' in this result, we obtain rk AA' =
rk A' = rk A. Q.E.D.
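Corollary I.1 can be illustrated on small integer matrices. The sketch below computes rank by Gaussian elimination over exact fractions; the function `rank` and the two sample matrices are assumptions chosen for the illustration, and this is not a numerically robust rank routine for floating-point data.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def rank(M):
    # Forward elimination with exact arithmetic; counts pivot rows.
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

examples = [
    [[1, 2], [2, 4], [3, 6]],   # rank 1 (second column is twice the first)
    [[1, 0], [0, 1], [1, 1]],   # rank 2 (full column rank)
]
ranks = [(rank(A),
          rank(matmul(transpose(A), A)),    # rk A'A
          rank(matmul(A, transpose(A))))    # rk AA'
         for A in examples]
print(ranks)
```

In each triple the three ranks coincide, although A'A is 2 x 2 and AA' is 3 x 3.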

Let us remark, that although our development is confined 
to vector spaces over the real numbers, the analogous development 
for complex vector spaces requires only the substitution of A*,
the conjugate transpose of A, for A', the ordinary transpose of
A, in every statement involving A'.
II. GENERALIZED INVERSES


Perhaps the main application of matrix theory is the insight
it gives us when we try to analyze a system of linear equations,
say

    a_{11} x_1 + ... + a_{1m} x_m = y_1
        .
        .
        .
    a_{n1} x_1 + ... + a_{nm} x_m = y_n

It is well-known that this system can be rewritten as the
matrix equation Ax = y, where A = [a_{ij}], x = [x_1, ..., x_m]',
and y = [y_1, ..., y_n]'.

In the case where A is square and non-singular, a unique solution
exists for every vector y. This is given by x_0 = A^{-1}y. In the
case where A is square and singular or A is not square, the system
may have a solution, or it may not have one.

The advantages of a generalized inverse (g.i.), A^g, of an arbitrary
matrix A are the following:

i) It always exists, and when the equation Ax = y
is consistent, i.e. has a solution, x_0 = A^g y is
one such solution.

ii) If Ax = y is inconsistent, i.e. has no solution, the
methods of working with a g.i. can be used to obtain
a best approximate solution in the sense that we may
find a vector x_0 so that ||Ax_0 - y|| is as small as
possible. That is, we may find a vector x_0 so that
||Ax_0 - y|| = inf ||Ax - y||.

With this introduction, let us then make the following definition.

Definition II.1. Let A be an n x m matrix. An m x n matrix A^g
is called a g.i. of A if AA^gA = A.

Theorem II.1 (Bose). If A is n x m, a g.i. A^g of A always exists.

Proof: (Ref. 25) We know that we may row and column reduce A
to obtain the matrix B where

    B = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}

Here rk A = r, and the number of zeros in each block is such that
B is n x m. This is equivalent to saying that there are non-singular
matrices P (n x n) and Q (m x m) such that

(1) PAQ = B, which implies
    A = P^{-1}BQ^{-1}.

It is clear that if B^g is taken to be B', we have

(2) BB^gB = B.

We define A^g = QB^gP. Then,

    AA^gA = P^{-1}BQ^{-1} QB^gP P^{-1}BQ^{-1}
          = P^{-1}BB^gBQ^{-1} = P^{-1}BQ^{-1} = A. Q.E.D.
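The construction in the proof can be followed on a concrete singular matrix. In the sketch below the matrices P and Q were found by hand (one row operation and one column operation); they, and the helper names, are assumptions made for the illustration.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2],
     [2, 4]]            # singular, rank 1
P = [[1, 0],
     [-2, 1]]           # subtracts twice row 1 from row 2
Q = [[1, -2],
     [0, 1]]            # subtracts twice column 1 from column 2

B = matmul(matmul(P, A), Q)      # reduced form [[1, 0], [0, 0]]
Bg = transpose(B)                # B' is a g.i. of B
Ag = matmul(matmul(Q, Bg), P)    # A^g = Q B^g P

check = matmul(matmul(A, Ag), A) == A
print(B, Ag, check)
```

The resulting A^g is [[1, 0], [0, 0]], which is clearly not an ordinary inverse, yet AA^gA = A holds.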

Corollary II.1. If PAQ = B where P and Q are non-singular matrices,
and B^g is any g.i. of B, then QB^gP is a g.i. of A.

Theorem II.2. If A is n x m, PAQ = B as above, then every g.i. A^g
of A is given by QB^gP where B^g is some g.i. of B.

Proof: Let A^g be any g.i. of A. Then,

    BQ^{-1}A^gP^{-1}B = PAQQ^{-1}A^gP^{-1}PAQ = PAA^gAQ = PAQ = B.

Thus Q^{-1}A^gP^{-1} is a g.i. of B, and A^g = Q(Q^{-1}A^gP^{-1})P. Q.E.D.

We note that in general A^g is not unique. In fact, if PAQ = B,
where

    B = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}

it is easily seen that we may define

    B^g = \begin{bmatrix} I_r & U \\ V & W \end{bmatrix}

where U, V, W are arbitrary matrices of appropriate dimensions,
and we still have BB^gB = B. Thus, in general, there may be many
g.i.'s for a matrix.
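The freedom in U, V, W is easy to observe in the smallest case r = 1 with B a 2 x 2 matrix, where U, V, W are single numbers. The sketch below is illustrative only; the helper `is_gi` and the particular values tried are assumptions.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_gi(A, G):
    # G is a g.i. of A exactly when A G A = A.
    return matmul(matmul(A, G), A) == A

B = [[1, 0],
     [0, 0]]             # I_r with r = 1, bordered by zeros

# B^g = [[1, u], [v, w]] for several arbitrary choices of u, v, w.
choices = [(0, 0, 0), (1, -2, 3), (5, 7, -1)]
results = [is_gi(B, [[1, u], [v, w]]) for u, v, w in choices]

# Changing the I_r block itself does break the defining equation.
zero_block = is_gi(B, [[0, 0], [0, 0]])
print(results, zero_block)
```

Every choice of u, v, w passes, while replacing the leading block by zero fails.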

The following theorem justifies the naming of a g.i.

Theorem II.3. If A is an n x n, non-singular matrix, then A^g = A^{-1}.

Proof: Since A is non-singular, the only left identity for A is
I_n. We know that AA^g is a left identity for A. Thus, AA^g = I_n,
and A^g = A^{-1} by left multiplication by A^{-1}. Q.E.D.

We should comment here that, in general, (AB)^g ≠ B^gA^g. However,
if B and C are non-singular, then

    (BAC)^g = C^{-1}A^gB^{-1}.


The next theorem, due to Penrose, gives the utility of the g.i. in
solving a matrix equation.

Theorem II.4. Let A be n x m, B be p x q, C be n x q.

1) The matrix equation AXB = C has a solution if and only
if there are g.i.'s A^g of A and B^g of B so that AA^gCB^gB = C.

2) If AXB = C is consistent, then we obtain the most
general solution X by choosing A^g, B^g fixed, but arbitrary, g.i.'s
and setting

    X = A^gCB^g + Y - A^gAYBB^g

where Y is an arbitrary m x p matrix.

Proof: 1) Suppose AXB = C has the solution X_0; then if A^g
and B^g are any g.i.'s of A and B,

    C = AX_0B = AA^gAX_0BB^gB = AA^gCB^gB.

Conversely, if AA^gCB^gB = C for some A^g and B^g, then A^gCB^g is a
solution to AXB = C.

2) Suppose AXB = C is consistent, and A^g, B^g are any g.i.'s
of A and B. Then the matrix

    X = A^gCB^g + Y - A^gAYBB^g

is a solution for any m x p matrix Y. Conversely, if X_0 is any
solution, then X_0 is m x p, and AX_0B = C. Hence,

    X_0 = A^gCB^g + X_0 - A^gCB^g
        = A^gCB^g + X_0 - A^gAX_0BB^g. Q.E.D.
Corollary II.2. If A is n x m, x and y are vectors of appropriate
dimensions, then Ax = y is consistent if and only if there is a
g.i. A^g so that AA^gy = y. When Ax = y is consistent, the most
general solution is given by

    x = A^gy + (I_m - A^gA)z

where A^g is any g.i. of A, and z is an arbitrary m-vector.
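The general solution formula of Corollary II.2 can be tried on a consistent singular system. The sketch below uses a hand-picked g.i. of a 2 x 2 matrix; the matrix, the right-hand side, the g.i., and the helper names are all assumptions for the illustration.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 2],
     [2, 4]]             # singular
Ag = [[1, 0],
      [0, 0]]            # one g.i. of A (A Ag A = A)
y = [1, 2]               # lies in the column space of A

# Consistency test: A A^g y = y.
consistent = matvec(A, matvec(Ag, y)) == y

# N = I - A^g A maps every z into the null space of A.
AgA = matmul(Ag, A)
N = [[(1 if i == j else 0) - AgA[i][j] for j in range(2)] for i in range(2)]

def solution(z):
    # x = A^g y + (I - A^g A) z
    return [p + h for p, h in zip(matvec(Ag, y), matvec(N, z))]

solutions = [solution(z) for z in ([0, 0], [1, 1], [3, -2])]
print(consistent, solutions)
```

Each vector in `solutions` satisfies Ax = y; different choices of z move along the null space of A.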

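We note that the general linear system

(3) A_1XB_1 + A_2XB_2 + ... + A_rXB_r = C

may be rewritten as follows.

Suppose A_k is n x m, X is m x p, B_k is p x q, and C is n x q.
Let

i) B_k = [B_1^k | ... | B_q^k], where B_i^k = [b_{1i}^k, ..., b_{pi}^k]'
is the ith column of B_k.

ii) X = [X_1 | ... | X_p], where X_i is the ith column of X.

iii) x be the pm-vector obtained by stacking the columns
X_1, ..., X_p of X.

iv) D_i^k be the n x pm matrix
D_i^k = [b_{1i}^k A_k | ... | b_{pi}^k A_k].

v) D_i = D_i^1 + ... + D_i^r, and D be the nq x pm matrix obtained by
stacking D_1, ..., D_q.

vi) C = [C_1 | ... | C_q], where C_i is the ith column of C, and c be
the nq-vector obtained by stacking C_1, ..., C_q.

Then the ith column of A_kXB_k is D_i^k x, so that (3)
may be rewritten

    Dx = c.

Thus (3) may be analyzed as above.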
III. MINIMIZATION OF SYSTEMS OF LINEAR EQUATIONS


Now that the reader has seen the preceding results, he might 
wonder what happens if the system Ax = y is inconsistent. It 
turns out that through the use of generalized inverses, we may 
obtain vectors which are as "close" to solutions as possible in the 
following sense. 

Definition III.1. A vector x_0 is said to minimize the equation
Ax = y if

    ||Ax_0 - y|| = inf_x ||Ax - y||.

We shall show that there is a unique vector x_0 of smallest
norm which minimizes Ax = y, and we will call this the minimal
solution vector or, more simply, the minimal vector for Ax = y.

When referring to the minimization of Ax = y, we may use the more
suggestive terminology of the minimization of ||Ax - y||.

To begin this discussion, we prove a well-known theorem which 
dates back to the time of Gauss. 

Theorem III.1. The following conditions on a vector x are equivalent:

1) x minimizes Ax = y.

2) Ax = proj (y; R(A)).

3) x satisfies A'Ax = A'y.

Proof: We shall prove 1) ⟺ 2) and then 2) ⟺ 3). We need the fol-
lowing lemmas:

Lemma III.1. Let A be an n x m matrix; then

a) N(A)^⊥ = R(A')

b) N(A') = R(A)^⊥
Proof of Lemma III.1. Since a) follows from b) and the facts that
(R(A)^⊥)^⊥ = R(A), and (A')' = A, it is sufficient to prove b).

Let x ∈ N(A'). We show that x ∈ R(A)^⊥. That is, we must
show that if t ∈ R(A), then (x,t) = 0. We know that there is a y so that
t = Ay. Thus

    (x,t) = (x,Ay) = y'A'x = y'0 = 0.

Hence N(A') ⊆ R(A)^⊥.

Now we show that if x ∉ N(A'), then the vector y = A(A'x) ∈ R(A),
and (y,x) ≠ 0. Indeed, if A'x ≠ 0, then (A'x, A'x) ≠ 0. But

    (A'x, A'x) = x'AA'x = x'y = (y,x).

Hence (y,x) ≠ 0, so x ∉ R(A)^⊥, and therefore R(A)^⊥ ⊆ N(A'). Q.E.D.

Lemma III.2. Let A be n x m, and consider Ax = y where x is an m-vector,
y an n-vector. Then

    α = inf_x ||Ax-y||^2 = ||y_1||^2

where

    y_1 = proj (y; R(A)^⊥).

Proof of Lemma III.2. Since y is an n-vector, we may write

    y = y_0 + y_1 with y_0 ∈ R(A), y_1 ∈ R(A)^⊥.

Then, since Ax - y_0 ∈ R(A) is orthogonal to y_1,

    ||Ax-y||^2 = ||Ax-y_0||^2 + ||y_1||^2 ≥ ||y_1||^2.

Since y_0 ∈ R(A), there is an x with Ax = y_0, and we conclude that

    α = ||y_1||^2. Q.E.D.

Corollary III.1. ||Ax-y||^2 = α if and only if

    Ax = y_0 = proj (y; R(A)).

Now to the proof of Theorem III.1: 1) ⟺ 2). This follows easily
from Corollary III.1.

2) ⟺ 3). Suppose 2) is true. Let Ax = y_0; then

    A'y = A'y_0 + A'y_1 = A'y_0

since y_1 ∈ R(A)^⊥ = N(A'). Thus A'y = A'Ax and 3) is true.
Conversely, let x be such that A'Ax = A'y; then

    Ax ∈ R(A) and y_0 ∈ R(A)
    ⇒ Ax - y_0 ∈ R(A).

But

    0 = A'Ax - A'y = A'Ax - A'y_0 = A'(Ax - y_0)

so

    Ax - y_0 ∈ N(A') = R(A)^⊥
    ⇒ Ax - y_0 = 0 ⇒ Ax = y_0, and hence
    x minimizes ||Ax-y||. Q.E.D.

The advantage of this result is that it enables us to
translate the problem of minimizing ||Ax-y|| to the easier problem
of finding a solution to the equation A'Ax = A'y. Such a
solution always exists since A'y ∈ R(A') = R(A'A). Later we
shall be interested in finding the unique solution x_0 of A'Ax = A'y
which has smallest norm, i.e., the minimal vector for Ax = y,
but now we content ourselves with finding any x such that A'Ax = A'y.

According to what has been said (Corollary II.2), a solution
to A'Ax = A'y is given by x = (A'A)^gA'y where (A'A)^g is any g.i.
of A'A. Since the equations A'Ax = A'y are called the normal
equations for Ax = y, we are led to the following.

Definition III.2. If A is an n x m matrix, then the m x n matrix
A^n = (A'A)^gA', where (A'A)^g is any g.i. of A'A, is called a
normalized generalized inverse of A.

This structure was first studied by Zelen, who used the term
weak generalized inverse.

We observe first that AA^nA = A, so A^n is a g.i. of A. In
fact, A^n satisfies many more properties which we shall study
later, and which we shall use to give an equivalent definition of
A^n. Let us note, in this terminology, that the vector x = A^ny
minimizes the equation Ax = y where A^n is any normalized g.i. of A.
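To see the normal equations at work, consider an inconsistent system with three equations and two unknowns. In the sketch below A'A happens to be non-singular, so its only g.i. is the ordinary inverse (Theorem II.3) and the 2 x 2 system is solved by Cramer's rule; the data and helper names are assumptions chosen for the illustration.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 0],
     [0, 1],
     [1, 1]]
y = [1, 1, 0]            # not in R(A): Ax = y has no exact solution

At = transpose(A)
G = matmul(At, A)        # A'A = [[2, 1], [1, 2]]
b = matvec(At, y)        # A'y = [1, 1]

# Solve the normal equations A'A x = A'y by Cramer's rule (2 x 2 case).
det = Fraction(G[0][0] * G[1][1] - G[0][1] * G[1][0])
x = [(G[1][1] * b[0] - G[0][1] * b[1]) / det,
     (G[0][0] * b[1] - G[1][0] * b[0]) / det]

def sq_residual(v):
    # ||Av - y||^2
    return sum((a - t) ** 2 for a, t in zip(matvec(A, v), y))

residual = [a - t for a, t in zip(matvec(A, x), y)]
trials = [[0, 0], [1, 1], [Fraction(1, 2), Fraction(1, 2)]]
print(x, sq_residual(x), [sq_residual(t) for t in trials])
```

The residual Ax - y is orthogonal to the columns of A, in agreement with condition 2) of Theorem III.1, and each trial vector gives a strictly larger squared residual.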
IV. PSEUDO-INVERSES


Let us study the normalized g.i. A^n. We will show that if
A^n is a normalized g.i. with the property that (A^nA)' = A^nA,
then the minimal vector of Ax = y is x_0 = A^ny. But, before we
do this, let us look at things from a slightly different point
of view.

The following lemma and corollary which are due to Bose, 
are found in the paper by Rohde (Ref. 25). 

Lemma IV.1.

Let X' be a p x n matrix. There exists a p x n matrix Y
such that

1) (X'X)Y = X'

2) XY is unique in the sense that if X'XY_1 = X'
and X'XY_2 = X', then XY_1 = XY_2.

3) (XY)' = XY

4) (XY)^2 = XY.

Proof

1) To prove the existence of Y, all we need observe is that every
column of X' is in the column space of X'X. Hence, there is a Y
such that X'XY = X'.

2) Suppose X'XY_1 = X' and X'XY_2 = X'. We will show that for
i = 1, ..., n, the ith columns of XY_1 and XY_2 are equal.

Let Y_{1i} be the ith column of Y_1, and Y_{2i} be the ith column
of Y_2.
Then X'XY_{1i} = X'XY_{2i}

    ⇒ X'X(Y_{1i} - Y_{2i}) = 0

    ⇒ (Y_{1i} - Y_{2i})'X'X(Y_{1i} - Y_{2i}) = 0

    ⇒ the inner product (X(Y_{1i} - Y_{2i}), X(Y_{1i} - Y_{2i})) = 0

    ⇒ XY_{1i} = XY_{2i}.

3) (XY)' = Y'X' = Y'(X'XY) = (Y'X'X)Y
         = (X'XY)'Y = (X')'Y = XY.

4) (XY)^2 = (XY)(XY) = (XY)'XY
          = Y'X'XY = Y'X' = (XY)' = XY. Q.E.D.

Corollary IV.1.

The matrix A(A'A)^gA', where (A'A)^g is any g.i. of A'A,
is uniquely determined, symmetric and idempotent.

Proof

The equation A'AX = A' has a solution given by X = (A'A)^gA'.
Hence AX = A(A'A)^gA' has the desired properties, by Lemma IV.1.
Theorem IV.1 (Rohde).

The matrix A^n is a normalized g.i. of A if and only if A^n
satisfies the following:

5) AA^nA = A

6) A^nAA^n = A^n

7) (AA^n)' = AA^n.

Proof

Necessity.

If A^n is a normalized g.i. of A, there is a matrix (A'A)^g such that
A^n = (A'A)^gA'.
Then AA^nA = A(A'A)^gA'A = A, since (A'A)^gA'A is a right identity
for A'A, and hence is a right identity for A.

Similarly,

    A^nAA^n = (A'A)^gA'A(A'A)^gA' = (A'A)^gA' = A^n.

Thus properties 5) and 6) are satisfied. Property 7) is satisfied
by Corollary IV.1.

Sufficiency.

From properties 6) and 7), we have that

    row space A^n ⊆ row space AA^n
                  = column space AA^n ⊆ column space A
                  = row space A'.

Thus, there is a matrix X such that A^n = XA'. We show that X is
a g.i. of A'A.

Indeed, A'AXA'A = A'AA^nA = A'A by property 5). Q.E.D.

The characterization of normalized g.i.'s given in the above
theorem is what is usually used to define these structures. Pur-
suing this type of reasoning, let us, for an arbitrary n x m matrix
A, consider the following equations:

8) AXA = A

9) XAX = X

10) (AX)' = AX

11) (XA)' = XA

We have shown that 8), 9), 10) have a solution for any A, and we
shall presently show that in fact these four equations have a
unique solution for any A. But, before we do this, let us just
mention that in the literature so far there have been four types
of g.i.'s studied. These are obtained as follows:

For an arbitrary matrix A, a matrix X is called

a generalized inverse (g-inverse) if it
satisfies 8).

a reflexive generalized inverse (r-inverse)
if it satisfies 8), 9).

a normalized generalized inverse (n-inverse)
if it satisfies 8), 9), and 10).

a pseudo-inverse (p-inverse) if it satisfies
8), 9), 10), and 11).
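The four types form a chain, each adding one of the equations 8) to 11). The following sketch tests the equations in order and reports the strongest type a candidate X satisfies; the function `classify`, its return strings, and the sample matrices are assumptions made for the illustration.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def classify(A, X):
    AX, XA = matmul(A, X), matmul(X, A)
    if matmul(AX, A) != A:          # equation 8) fails
        return "not a g.i."
    if matmul(XA, X) != X:          # equation 9) fails
        return "g-inverse"
    if transpose(AX) != AX:         # equation 10) fails
        return "r-inverse"
    if transpose(XA) != XA:         # equation 11) fails
        return "n-inverse"
    return "p-inverse"

A = [[1, 0],
     [0, 0]]
kinds = [
    classify(A, [[1, 0], [0, 1]]),   # satisfies 8) only
    classify(A, [[1, 1], [0, 0]]),   # satisfies 8), 9)
    classify(A, [[1, 0], [1, 0]]),   # satisfies 8), 9), 10)
    classify(A, [[1, 0], [0, 0]]),   # satisfies all four
]
print(kinds)
```

All four candidates are generalized inverses of the same matrix A, but only the last one is the pseudo-inverse.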

Our main purpose is the study of the p-inverse which has the
particularly nice property that it is unique. However, to facili-
tate some later proofs, we shall give the following two results due
to Rohde, who studied in some detail the properties of the four
types of g.i.'s (Ref. 25).

Theorem IV.2.

For any n x m matrix A, and any g.i. A^g of A, rank A^g ≥ rank A,
and rank A^gA = rank AA^g = rank A.

Proof: By Theorem 1.2,

    rk A = rk AA^gA ≤ rk A^gA ≤ rk A^g;

further,

    rk A ≥ rk AA^g ≥ rk AA^gA = rk A,

and

    rk A ≥ rk A^gA ≥ rk AA^gA = rk A. Q.E.D.
Theorem IV.3.

The matrix A^g is an r-inverse of A if and only if rk A^g = rk A.

Proof:

If A^g is an r-inverse of A, then A is a g.i. of A^g. Hence

    rk A ≤ rk A^g ≤ rk A
    ⇒ rk A = rk A^g.

Now suppose rk A^g = rk A.

By Theorem II.2 and the remarks which follow it, we know that
A^g = QB^gP where

    PAQ = B = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, \qquad B^g = \begin{bmatrix} I_r & U \\ V & W \end{bmatrix},

where U, V, W are arbitrary
matrices of appropriate dimensions.

We have that rk A^g = rk A = rk B and rk A^g = rk B^g since
multiplication by non-singular matrices preserves rank.

Thus rk B = rk B^g.

If we show that this implies that B^gBB^g = B^g, we have

    A^gAA^g = QB^gP P^{-1}BQ^{-1} QB^gP = QB^gBB^gP
            = QB^gP = A^g

which is our desired result.

Thus we must show B^gBB^g = B^g.
We have

    \operatorname{rk} B^g = \operatorname{rk} \begin{bmatrix} I_r & 0 \\ -V & I \end{bmatrix} \begin{bmatrix} I_r & U \\ V & W \end{bmatrix} \begin{bmatrix} I_r & -U \\ 0 & I \end{bmatrix} = \operatorname{rk} \begin{bmatrix} I_r & 0 \\ 0 & W - VU \end{bmatrix} \geq \operatorname{rk} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} = \operatorname{rk} B.

Hence,

    rk B^g = rk B ⇒ rk (W - VU) = 0

    ⇒ W = VU ⇒ B^g = \begin{bmatrix} I_r & U \\ V & VU \end{bmatrix}

Thus

    B^g B B^g = \begin{bmatrix} I_r & U \\ V & VU \end{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} I_r & U \\ V & VU \end{bmatrix} = \begin{bmatrix} I_r & 0 \\ V & 0 \end{bmatrix} \begin{bmatrix} I_r & U \\ V & VU \end{bmatrix} = \begin{bmatrix} I_r & U \\ V & VU \end{bmatrix} = B^g. Q.E.D.


Henceforth in this paper we shall be concerned with pseudo-inverses. 
We begin by giving the basic theorem of Penrose (Ref. 20 ) concerning 
the existence and uniqueness of p-inverses. 
Theorem IV.3.

For any n x m matrix A, there is a unique m x n matrix A^+,
called the pseudo-inverse (p-inverse) of A, satisfying:

12) AA^+A = A

13) A^+AA^+ = A^+

14) (AA^+)' = AA^+

15) (A^+A)' = A^+A

Proof

We first note that AA' = 0 ⇒ A = 0 and then that

16) BAA' = CAA' ⇒ BA = CA

and

17) A'AB = A'AC ⇒ AB = AC

These follow respectively from

18) (BAA' - CAA')(B-C)' = (BA - CA)(BA - CA)'

and

19) (B-C)'(A'AB - A'AC) = (AB - AC)'(AB - AC)

Now, since row space AA' = row space A', there is a matrix W
such that WAA' = A'. Similarly, since column space A' = column
space A'A, there is a matrix Y such that A'AY = A'.

Let us define A^+ = WAY.

Since AWAA' = AA' and A'AYA = A'A, we have by 16) and 17) that AWA = A
and AYA = A. Thus AA^+A = AWAYA = AYA = A and A^+AA^+ = WAYAWAY =
WAWAY = WAY = A^+. Hence 12) and 13) are satisfied.

Now,

    (AY)' = Y'A' = Y'A'AY = (AY)'AY

which shows that (AY)' is symmetric and hence AY = (AY)'.
Similarly, WA = (WA)'.

Thus

    AA^+ = AWAY = AY = (AY)' = (AA^+)'

and

    A^+A = WAYA = WA = (WA)' = (A^+A)'

Hence A^+ = WAY is a p-inverse of A.

Uniqueness:

Let us note that for any p-inverse A^+ of A, rk A^+ = rk A.
Further, we have

    row space A^+ ⊆ row space AA^+
                  = column space AA^+ ⊆ column space A
                  = row space A'
    ⇒ row space A^+ = row space A'.

Similarly, column space A^+ = column space A'. Thus
A'AA^+ = A', and A^+AA' = A'.

Now, let G and X be p-inverses of A.

Then

    GAA' = A' ⇒ GAX = X and
    A'AX = A' ⇒ GAX = G. Q.E.D.

The following properties of the p-inverse were obtained by
Penrose.

Theorem IV.4.

Let A be a matrix. Then

20) (A^+)^+ = A

21) (A')^+ = (A^+)'

22) A^+ = A^{-1} if A is non-singular

23) (λA)^+ = λ^+A^+, λ a real number,
where λ^+ = 0 if λ = 0, and λ^+ = 1/λ if λ ≠ 0.

24) (A'A)^+ = A^+(A^+)'

25) If U and V are orthogonal matrices, then
(UAV)^+ = V'A^+U'.

26) If A = Σ A_i where A_iA_j' = 0 and A_i'A_j = 0
for i ≠ j, then A^+ = Σ A_i^+.

27) A^+ = (A'A)^+A' = A'(AA')^+

28) If A is symmetric and PAP' = D = diag {λ_1, ..., λ_r, 0, ..., 0},
then A^+ = P'D^+P where D^+ = diag {λ_1^{-1}, ..., λ_r^{-1}, 0, ..., 0}.

29) A^+A, AA^+, I-A^+A, I-AA^+ are all symmetric, idempotent
matrices.

30) If A is normal, i.e. AA' = A'A, then AA^+ = A^+A and
(A^m)^+ = (A^+)^m for m a positive integer.

31) A, A'A, A^+, A^+A, AA^+ all have rank equal to trace A^+A.

Proof

We shall sketch parts of the proof.

Property 24)

We have

    A'A A^+(A^+)' A'A = A'(A^+)'A'A (see proof of Theorem IV.3)
                      = A'A.

The other properties are similar.

Property 27)

We have A(A'A)^+A'A = A since (A'A)^+A'A is a right identity
for A'A, and hence is a right identity for A.

Property 30)

Suppose AA' = A'A. Then

    AA^+ = (AA^+)' = (A^+)'A' = (A^+)'A^+AA'
         = (AA')^+AA' = (A'A)^+A'A = A^+A.

Property 31)

We know that rk A = rk A'A = rk A^+ by Theorems 1.3 and IV.3.
Now

    rk A = rk AA^+A ≤ rk A^+A ≤ rk A

so rk A = rk A^+A, and similarly rk A = rk AA^+. Since A^+A is
symmetric and idempotent, there is an orthogonal matrix P such that

    PA^+AP' = diag {1, 1, ..., 1, 0, ..., 0}.

Then

    tr A^+A = tr PA^+AP' = rk A^+A. Q.E.D.
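Properties 27) and 31) can be checked together on a matrix of full column rank. In the sketch below A'A is non-singular, so by property 22) its p-inverse is the ordinary inverse and property 27) reduces to A^+ = (A'A)^{-1}A'; the sample matrix and helper names are assumptions made for the illustration.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]     # 3 x 2, rank 2

At = transpose(A)
G = matmul(At, A)                    # A'A = [[2, 1], [1, 2]], non-singular
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
G_inv = [[G[1][1] / det, -G[0][1] / det],
         [-G[1][0] / det, G[0][0] / det]]
A_plus = matmul(G_inv, At)           # property 27): A+ = (A'A)+ A'

AAp, ApA = matmul(A, A_plus), matmul(A_plus, A)
penrose = (matmul(AAp, A) == A and              # 12)
           matmul(ApA, A_plus) == A_plus and    # 13)
           transpose(AAp) == AAp and            # 14)
           transpose(ApA) == ApA)               # 15)
print(penrose, trace(ApA), trace(AAp))
```

Both traces equal 2, the rank of A, as property 31) predicts; here A^+A is the 2 x 2 identity while AA^+ is a 3 x 3 projection.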

We observe that, in general, (AB)^+ ≠ B^+A^+. However, Cline
(Ref. 5) has shown that we may find matrices B_1 and A_1 such that

    AB = A_1B_1 and (AB)^+ = B_1^+A_1^+

In fact B_1 = A^+AB, and A_1 = AB_1B_1^+.

Now that we have the p-inverse, let us return to our problem of
finding the minimal solution vector of a matrix equation Ax = y
where A is n x m. Recall that we have shown that a vector x_0
minimizes the equation Ax = y if and only if A'Ax_0 = A'y. We have
also shown that A'Ax = A'y is consistent if and only if
A'A(A'A)^gA'y = A'y, and that
if x_0 = (A'A)^gA'y, then x_0 minimizes Ax = y, where (A'A)^g is an
arbitrary g.i. of A'A.

We thus know that if x_0 = A^ny, for some n-inverse A^n of A, then
x_0 minimizes Ax = y. We now ask what further conditions we must
put on x_0 so that it will be the minimal vector for Ax = y.

We shall show presently that a necessary and sufficient condition for
x_0 to be the minimal vector is that it minimize Ax = y and belong to
R(A'). We will also show that there is only one minimal vector for
Ax = y. For the time being, let us assume we know these results.

We then could write any minimizing vector x as x_1 + x_2 where x_1 ∈ R(A')
and x_2 ∈ R(A')^⊥. Then, since Ax_2 = 0 (R(A')^⊥ = N(A)), we would
have that Ax = Ax_1,
which would mean that x_1 also minimizes Ax = y, and x_1 ∈ R(A'). But
we know that we can obtain a minimizing vector x by setting x = A^ny
where A^n is any n-inverse of A. From what we have said, if A^ny were
in R(A'), A^ny would be precisely the minimal vector for Ax = y.

Now it seems reasonable to ask just what restrictions on A^n
we need to insure that A^n y ∈ R(A'). This we state as the following
theorem, which is another form of a result stated by Albert (Ref. 2).

Theorem IV.5.

Let A be an n x m matrix; A^n an n-inverse of A. Then if (A^nA)'
= A^nA, i.e., if A^n is the p-inverse of A, A^n y = A^+y ∈ R(A') for
all y. Conversely, if an n-inverse A^n is such that A^n y ∈ R(A') for
all y, then A^n = A^+.

Proof

If A^n = A^+, then A^+y ∈ R(A^+) = R(A').

Conversely, if A^n y ∈ R(A') for all y, then we must show that A^n = A^+.
In other words, we must show that A^+ is the only matrix M satisfying:

32) row space M ⊂ row space A'.

33) column space M ⊂ column space A'.

34) AM is a left identity for A.

35) MA is a right identity for A.

We first prove that AA^+ is the only left identity for A with rows
in the row space of A', and A^+A is the only right identity for A
with columns in the column space of A'.

Indeed, suppose that B is a left identity of A of the form XA'.
Then

XA'A - AA^+A = 0

⟹ XA'A - A(A'A)^+A'A = 0 by 27)

⟹ XA' = A(A'A)^+A' by 16) and the fact that (A')' = A.

Thus

B = XA' = AA^+.

Similarly, A^+A is the only right identity for A with columns
in the column space of A'.

Now, to complete the proof of Theorem IV.5, suppose that A_1
is any matrix satisfying 32) - 35).
Then

A_1 = A_1AA^+ = A^+AA^+ = A^+. Q.E.D.


The basic idea of the last proof is found in a paper by 
Greville (Ref. 13) . 

We now prove the theorem which will pick up all the loose ends 
we have left. 

Theorem IV.6.

36) A minimizing vector x_0 for the matrix equation Ax = y
is the minimal vector for Ax = y if and only if x_0 ∈ R(A').

37) The minimal vector x_0 is unique, and is given by x_0 = A^+y.

Proof 

Suppose x_0 is the minimal vector for Ax = y. Then x_0 has the
unique decomposition x_0 = x_1 + x_2 where x_1 = proj (x_0; R(A')) and
x_2 = proj (x_0; R(A')^⊥).

Since x_2 ∈ R(A')^⊥ = N(A), we have by Theorem III.1, Ax_0 = Ax_1
= proj (y; R(A)). Thus x_1 also minimizes Ax = y. But

||x_0||^2 = ||x_1||^2 + ||x_2||^2 ≥ ||x_1||^2

and x_0 is minimal, so

||x_0||^2 = ||x_1||^2 ⟹ x_2 = 0 ⟹ x_0 = x_1.

Hence

x_0 ∈ R(A').

Conversely, let x_0 be a minimizing vector for Ax = y which is in the
range of A'. Let x be any minimizing vector for Ax = y. We shall show
that ||x|| ≥ ||x_0||, and, in fact, x_0 = proj (x; R(A')). Write
x = t_1 + t_2 with t_1 ∈ R(A'), t_2 ∈ R(A')^⊥ = N(A).

Then

Ax = At_1 = proj (y; R(A)) = Ax_0
⟹ x_0 - t_1 ∈ N(A) = R(A')^⊥.

But

x_0 - t_1 ∈ R(A') ⟹ x_0 = t_1.

Now for the uniqueness, if x_0, t_0 are two minimal vectors for Ax = y,
then x_0 = proj (t_0; R(A')) = t_0.
Since A^+y minimizes Ax = y, and A^+y ∈ R(A'), A^+y is the minimal
vector for Ax = y. Q.E.D.

We shall summarize our results in the following theorems.

Theorem IV.7.

Let A be an n x m matrix with real entries. Let x be an
m-vector; y, an n-vector; α = inf_x ||Ax - y||. Then the minimal
vector x_0 for Ax = y is x_0 = A^+y where A^+ is the p-inverse of A.
This vector satisfies

38) ||Ax_0 - y|| = α, and if x is such that x ≠ x_0 and
||Ax - y|| = α, then ||x|| > ||x_0||.

39) x_0 ∈ R(A').

40) Ax_0 = proj (y; R(A)).

These results are a slight reformulation of those found in
Albert (Ref. 2). The following theorem, which is to be found in
Albert, summarizes the main applications of the p-inverse.

Theorem IV.8.

41) Let A be an n x m matrix with column vectors A_1, ..., A_m.
Then, if L(A_1, ..., A_m) = R(A) is the space spanned by these
vectors, y_0 = proj (y; L(A_1, ..., A_m)) = AA^+y.

42) y_1 = proj (y; N(A')) = (I - AA^+)y.

43) x minimizes ||Ax - y|| if and only if there is a z such that

x = A^+y + (I - A^+A)z.
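
As a small numerical illustration of Theorems IV.7 and IV.8, the sketch below works in exact rational arithmetic with a rank-one matrix. It is an added example, not part of the original development: the helper names (`matmul`, `transpose`, `minimizer`, `residual_sq`) are hypothetical, and it assumes the closed form A^+ = A'/tr(A'A) for a rank-one matrix, which the code checks against the Penrose conditions before relying on it.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Exact product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def norm_sq(v):
    """Squared Euclidean norm of a column vector stored as an n x 1 matrix."""
    return sum(row[0] ** 2 for row in v)

A = [[F(1), F(2)], [F(2), F(4)]]   # rank one, so Ax = y is usually inconsistent
y = [[F(1)], [F(0)]]

# Assumed closed form for a rank-one matrix: A^+ = A' / tr(A'A).
At = transpose(A)
AtA = matmul(At, A)
A_plus = [[t / (AtA[0][0] + AtA[1][1]) for t in row] for row in At]

# Verify the Penrose conditions so the assumption is checked, not trusted.
AAp, ApA = matmul(A, A_plus), matmul(A_plus, A)
assert matmul(AAp, A) == A and matmul(ApA, A_plus) == A_plus
assert AAp == transpose(AAp) and ApA == transpose(ApA)

x0 = matmul(A_plus, y)   # the minimal vector of 38) - 40)
N = [[F(int(i == j)) - ApA[i][j] for j in range(2)] for i in range(2)]  # I - A^+A

def minimizer(z):
    """A minimizer of ||Ax - y|| in the form 43): A^+y + (I - A^+A)z."""
    Nz = matmul(N, z)
    return [[x0[i][0] + Nz[i][0]] for i in range(2)]

def residual_sq(x):
    """Squared residual ||Ax - y||^2."""
    Ax = matmul(A, x)
    return norm_sq([[Ax[i][0] - y[i][0]] for i in range(2)])

x_other = minimizer([[F(1)], [F(0)]])
```

Here `x0` and `x_other` give the same residual, since both are mapped by A onto proj (y; R(A)), but `x0` has the strictly smaller norm, as 38) requires.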

In closing this section, we observe that there are at least
three alternative methods of defining the p-inverse A^+ of a matrix A.
The first two methods we have already mentioned, and we state now as

Theorem IV.9.

44) The p-inverse A^+ of A is the unique matrix such that
for any n-vector y, A^+y ∈ R(A'), and A^+y minimizes Ax = y.

45) The p-inverse A^+ is the unique matrix satisfying
32) - 35).

Albert (Ref. 2) uses another equivalent definition which we 
give as 

Theorem IV.10.

For any n x m matrix A,

A^+ = lim_{ε→0} (A'A + εI)^{-1} A'

    = lim_{ε→0} A' (AA' + εI)^{-1}

where A'A + εI will always be invertible if ε ≠ 0 and |ε| is less than
the absolute value of the smallest non-zero characteristic value of A'A.
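
The limit in Theorem IV.10 can be watched numerically. The following sketch is an added illustration under stated assumptions: it uses the rank-one matrix A with rows (1, 2) and (2, 4), takes A^+ = A'/25 as the known p-inverse of that particular matrix, and uses hypothetical helper names (`regularized`, `max_error`). Exact fractions are used so that rounding does not obscure the convergence.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Exact product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    """Inverse of a non-singular 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1), F(2)], [F(2), F(4)]]
At = transpose(A)
AtA = matmul(At, A)                              # singular: [[5, 10], [10, 20]]
A_plus = [[t / 25 for t in row] for row in At]   # p-inverse of this rank-one A

def regularized(eps):
    """(A'A + eps I)^{-1} A' for a non-zero eps."""
    M = [[AtA[i][j] + (eps if i == j else 0) for j in range(2)] for i in range(2)]
    return matmul(inverse_2x2(M), At)

def max_error(eps):
    """Largest entrywise distance between the regularized matrix and A^+."""
    R = regularized(eps)
    return max(abs(R[i][j] - A_plus[i][j]) for i in range(2) for j in range(2))

errors = [max_error(F(1, 10 ** k)) for k in (1, 3, 6)]
```

For this matrix the regularized expression works out to A'/(25 + ε), so the error shrinks in proportion to ε. The only non-zero characteristic value of A'A is 25, which is why any non-zero ε with |ε| < 25 gives an invertible A'A + εI.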
V. COMPUTATION OF THE PSEUDO-INVERSE


Having discussed at length the geometric applications of
the p-inverse, it seems desirable to have at hand an economical
method for its computation. In view of property 27) of section IV,
it suffices to find the p-inverse of matrices of the form A'A.

The following theorem shows that it suffices, in fact, to find an
arbitrary g.i. of a matrix of the form A'A.

Theorem V.1.

If A is symmetric, then A^+ = A[(A^2)^g A]^2 where (A^2)^g is an
arbitrary g.i. of A^2.

Proof

This is a straightforward application of Penrose's method for
the computation of the p-inverse.

We solve 1) WA^2 = A and
2) A^2Y = A.

A solution of 1) is W_0 = A(A^2)^g, and a solution of 2) is
Y_0 = (A^2)^g A.

Then

A^+ = W_0AY_0 = A(A^2)^g A(A^2)^g A
= A[(A^2)^g A]^2. Q.E.D.

Now suppose A = H'H. Then to find A^+, all we need to find is (A^2)^g.
Since A^2 = (H'H)(H'H) = (H'H)'(H'H), we have reduced the problem
to that of finding the g.i. of a matrix of the form H'H.

Let us remark that in the proof of the existence of a g.i. A^g
of an arbitrary matrix, we showed how to compute one. Our method
involved a row reduction, i.e. pre-multiplication by a non-singular
matrix P, and a column reduction, i.e. post-multiplication by a
non-singular matrix Q.

Rao has shown (Ref. 23) that when A is of the form H'H, we may
row-reduce A by pre-multiplication by a non-singular matrix P,
and that P is a g.i. of A, i.e. APA = A. Applying this to the
above theorem, we see that finding (A'A)^g involves only ordinary
row reduction. To prove Rao's result we need some additional terminology.

Definition 


We say that the n x n matrix A has the row-reduced zero 
property (A has r.r.z.p.) if its row reduced echelon form has the 
property that when its ith diagonal element is zero, its ith row 
is composed only of zeros. 

Lemma V.1. (Rao)

Let A be an n x n matrix with r.r.z.p. Let R be its row
reduced echelon form. Let P be a non-singular matrix such that
PA = R. Then

3) R is idempotent; i.e. R^2 = R.

4) AR = A.

5) APA = A, and hence P is a g.i. of A.

6) A necessary and sufficient condition that Ax = y
be consistent is that if the r_1th, r_2th, ... rows of R
are null, then the r_1th, r_2th, ... coordinates
of Py must be null.

7) A general solution of Ax = y is Py + (I - PA)z
where z is arbitrary.



Proof

3) Let C = R^2 = [c_ij], so that c_ij is the sum of r_ik r_kj over k.
We have three cases.

Case 1: i > j.

Then

c_ij = Σ_{k=1}^{j} r_ik r_kj + Σ_{k=j+1}^{n} r_ik r_kj = 0 = r_ij,

since r_ik = 0 for k ≤ j < i in the first sum and r_kj = 0 for k > j
in the second.

Case 2: Let i = j be fixed. Then either r_jj = 1 or r_jj = 0.

If r_jj = 1, then r_kj = 0 for k ≠ j, i.e. for all other entries in
the jth column of R.

Then the jth column of C is the same as the jth column of R.

If r_jj = 0, the jth row of R is composed only of zeros. Hence, the
same is true of the jth row of C.

Thus r_ij = c_ij for all i ≥ j.

Case 3: Now let r_ij be fixed where i < j. Then r_ij ≠ 0
only if r_ii = 1 and r_jj = 0. This implies that c_ij = r_ij.

Thus R^2 = R.

4) PA = R ⟹ A = P^{-1}R = P^{-1}R·R = AR.

5) We have APA = AR = A.

6) See Hoffman and Kunze (Ref. 17), Chapter 1.

7) This follows from Corollary II.2. Q.E.D.

Theorem V.2. (Rohde)

If A = H'H where H is any matrix, then A has the row reduced
zero property and hence the results of Lemma V.1 are true for A.

Proof

We prove this by induction on the dimension of A.

Let A = H'H be an n x n matrix, and assume that the result is
true for all matrices B of the form B = S'S where dimension B < n.
We may begin to row reduce A. Suppose that we have completed row
reduction of the first i-1 rows, and we wish to continue with the
reduction of the ith row. Let us see what we have done.
We started with A, which we can write as follows:

        | H_1' |                     | H_1'H_1   H_1'x_i   H_1'H_2 |
    A = | x_i' | [H_1  x_i  H_2]  =  | x_i'H_1   x_i'x_i   x_i'H_2 |
        | H_2' |                     | H_2'H_1   H_2'x_i   H_2'H_2 |

where H_1 is the matrix of the first i-1 columns of H, x_i is the
ith column of H, and H_2 is the matrix of the remaining columns of H.

We first observe that the (i-1) x (i-1) matrix H_1'H_1 satisfies
our inductive hypothesis. Thus, if we row reduce H_1'H_1, we do this
by pre-multiplication by a non-singular matrix (H_1'H_1)^g. With this
comment it should be clear that the non-singular matrix which has
row reduced the first i-1 rows of A is

    | (H_1'H_1)^g   0   0 |
    |      0        1   0 |
    |      0        0   I |

Now that we wish to work with the ith row of A, let us do this
in detail. We have put A in the form

    | (H_1'H_1)^g   0   0 |   | H_1'H_1   H_1'x_i   H_1'H_2 |
    |      0        1   0 |   | x_i'H_1   x_i'x_i   x_i'H_2 |
    |      0        0   I |   | H_2'H_1   H_2'x_i   H_2'H_2 |

        | R_1       (H_1'H_1)^g H_1'x_i   (H_1'H_1)^g H_1'H_2 |
      = | x_i'H_1   x_i'x_i               x_i'H_2             |
        | H_2'H_1   H_2'x_i               H_2'H_2             |

where R_1 is the row reduced form of H_1'H_1.

To reduce the ith row of this matrix, we pre-multiply by

    |     I       0   0 |
    | -x_i'H_1    1   0 |
    |     0       0   I |

to obtain

    | R_1       (H_1'H_1)^g H_1'x_i   (H_1'H_1)^g H_1'H_2 |
    | 0         f_i                   g_i                 |
    | H_2'H_1   H_2'x_i               H_2'H_2             |

where

f_i = x_i'x_i - x_i'H_1(H_1'H_1)^g H_1'x_i

and

g_i = x_i'H_2 - x_i'H_1(H_1'H_1)^g H_1'H_2.

Now we assume f_i = 0, and we wish to show that g_i = 0. We have that

0 = f_i = x_i'[I - H_1(H_1'H_1)^g H_1']x_i.

By Corollary IV.1, H_1(H_1'H_1)^g H_1' is symmetric and idempotent; hence,
so is

I - H_1(H_1'H_1)^g H_1'.

Thus

0 = f_i = x_i'(I - H_1(H_1'H_1)^g H_1')'(I - H_1(H_1'H_1)^g H_1')x_i
⟹ x_i'(I - H_1(H_1'H_1)^g H_1') = 0
⟹ 0 = x_i'(I - H_1(H_1'H_1)^g H_1')H_2
     = x_i'H_2 - x_i'H_1(H_1'H_1)^g H_1'H_2 = g_i.

Since this is true for i < n, A has r.r.z.p. Q.E.D.



Let us recapitulate briefly. To find a g.i. of the
n x n matrix A = H'H, we merely adjoin I_n to the right of A to get

[A | I_n]

Then we row reduce this matrix and get

[A^gA | A^g]
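
The recipe just stated can be sketched in a few lines. This is an added illustration, not the author's program: the function name `rao_row_reduce` and the test matrix H are hypothetical, and the reduction is done without row interchanges so that a null row stays in the position of its zero diagonal element, which is the form the row-reduced zero property refers to.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Exact product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def rao_row_reduce(A):
    """Row reduce [A | I] with no row interchanges; return (R, P) with PA = R.

    Assumes A = H'H, so that A has the row-reduced zero property.
    """
    n = len(A)
    M = [[F(t) for t in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        pivot = M[i][i]
        if pivot == 0:
            # For A = H'H the whole row of the left block must now be null.
            if any(M[i][:n]):
                raise ValueError("matrix lacks the row-reduced zero property")
            continue
        M[i] = [t / pivot for t in M[i]]
        for k in range(n):
            factor = M[k][i]
            if k != i and factor != 0:
                M[k] = [a - factor * b for a, b in zip(M[k], M[i])]
    return [row[:n] for row in M], [row[n:] for row in M]

H = [[F(1), F(2), F(0)], [F(2), F(4), F(1)]]
A = matmul(transpose(H), H)      # [[5, 10, 2], [10, 20, 4], [2, 4, 1]], rank 2
R, P = rao_row_reduce(A)         # left block A^gA = R, right block A^g = P

y = matmul(A, [[F(1)], [F(1)], [F(1)]])   # a consistent right-hand side
x = matmul(P, y)                          # particular solution Py of Ax = y
```

For this A the second row of R is null because the second column of H is twice the first, and P satisfies APA = A as Lemma V.1 asserts.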

To consolidate our computational method, we have

Theorem V.3.

Let A be an arbitrary n x m matrix. Then

A^+ = (A'A)^+A' = A'(AA')^+

and

(A'A)^+ = A'A[((A'A)^2)^g A'A]^2.

Thus

A^+ = A'A[((A'A)^2)^g A'A]^2 A'

and a similar formula holds involving AA'. This method involves six
matrix multiplications and one row reduction.
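
The formula of Theorem V.3 can be checked on a small matrix. The sketch below is an added illustration with hypothetical names (`gram_ginverse`, the 2 x 3 matrix A); it obtains the g.i. of (A'A)^2 by row reducing [(A'A)^2 | I] without row interchanges, which assumes, as Theorem V.2 states, that a matrix of the form H'H has the row-reduced zero property. Exact fractions are used so the Penrose conditions can be tested with equality.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Exact product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def gram_ginverse(G):
    """Row reduce [G | I] with no row interchanges; return P with GPG = G.

    Assumes G = H'H for some H, so that G has the row-reduced zero property.
    """
    n = len(G)
    M = [[F(t) for t in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(G)]
    for i in range(n):
        pivot = M[i][i]
        if pivot == 0:
            if any(M[i][:n]):
                raise ValueError("matrix lacks the row-reduced zero property")
            continue
        M[i] = [t / pivot for t in M[i]]
        for k in range(n):
            factor = M[k][i]
            if k != i and factor != 0:
                M[k] = [a - factor * b for a, b in zip(M[k], M[i])]
    return [row[n:] for row in M]

A = [[F(1), F(2), F(0)], [F(2), F(4), F(1)]]   # 2 x 3, so A'A is singular
At = transpose(A)
B = matmul(At, A)                    # A'A
G = gram_ginverse(matmul(B, B))      # a g.i. of (A'A)^2, one row reduction
GB = matmul(G, B)
B_plus = matmul(B, matmul(GB, GB))   # (A'A)^+ = A'A [((A'A)^2)^g A'A]^2
A_plus = matmul(B_plus, At)          # A^+ = (A'A)^+ A'
```

Counting the products `B`, `B*B`, `G*B`, `GB*GB`, `B*(...)` and the final multiplication by A' gives the six matrix multiplications mentioned in the text, alongside the single row reduction inside `gram_ginverse`.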

So far, the simplest method we have seen for finding the g.i.
of an arbitrary matrix involves a row reduction, column reduction,
and two matrix multiplications (Theorem II.2). The following
method, due to Frame (Ref. 11), shows that we may find a g.i. of
an n x m matrix A by little more than ordinary row reduction.

Definition

8) The distinguished columns of the n x m matrix A of rank s
are those linearly independent columns which are obtained by
starting at the first column on the left, moving to the right, and
deleting any column which is a linear combination of the columns
preceding it.

9) A is said to have the rank factorization A = BC where B
is the submatrix of the distinguished columns of A in their natural
order, and C is the submatrix of the top s rows of the row reduced
echelon form R of A.

We note that rk A = rk B = rk C, and that every matrix has a
unique rank factorization.

Now we keep A an n x m matrix of rank s. Let L be an n x n
non-singular matrix such that LA = R, where R is the row reduced
echelon form of A. Let L_1 be the submatrix of the top s rows of L,
and C be the submatrix of the top s rows of R. Let L_2 be the
submatrix of the remaining rows of L. The other rows of R are composed
only of zeros. The distinguished columns of C are those of I_s. If
we let V be the submatrix of the other columns of C in their original
order, then, writing the s x m partitioned matrix [I_s | V], we see
that we may interchange columns in [I_s | V] to get back to C. Letting
P be the matrix obtained by interchanging the appropriate columns of
I_m, we have C = [I_s | V]P. Now, we may write

        | P_1 |
    P = |     |
        | P_2 |

where the s x m matrix P_1 has as its non-zero columns the
distinguished columns of C, and the (m - s) x m matrix P_2 consists
of the other rows of the identity. Because of the way we have chosen
P, we have that

                  | P_1 |
    C = [I_s | V] |     | = P_1 + VP_2
                  | P_2 |

and

    P^{-1} = P' = [P_1' | P_2'].

We assert that the n x s matrix B = AP_1' consists exactly of the
distinguished columns of A. Let us see why this is so. The m x s
matrix P_1' has as its columns some of the columns of I_m. Since the
non-zero columns of P_1 are the distinguished columns of C, they occur
in the same positions as the distinguished columns of A.

That is, the jth column of P_1 is non-zero if and only if the jth
column of A is a distinguished column of A. Now the (i, j)th entry
of P_1 is non-zero if and only if the ith row of P_1 is
e_j' = [0, ..., 0, 1, 0, ..., 0], with the 1 in the jth position.
When we multiply AP_1', the ith column of this product is the jth
column of A, since the ith column of this product is just A times
the ith column of P_1', which is e_j = [0, ..., 0, 1, 0, ..., 0]'.

Using the fact that B = AP_1' is the matrix of the s distinguished
columns of A, we see that A has the rank factorization A = BC =
AP_1'L_1A.

We state this formally with another observation as

Theorem V.4.

Let A be an n x m matrix of rank s. Let L, C, L_1, L_2, P_1, P_2
be defined as above. Then the matrix A^r = P_1'L_1 is a reflexive g.i.
of A; i.e. A^r satisfies AA^rA = A, and A^rAA^r = A^r.

Proof

By the above argument, we conclude that A^r satisfies AA^rA = A.
Further, by the structure of P_1', we see that P_1'L_1 has as its
non-zero rows exactly the rows of L_1, which are linearly independent.
Hence rk P_1'L_1 = rk L_1 = s = rk A. Thus, by Theorem IV.3, P_1'L_1
is an r-inverse of A. Q.E.D.

The following corollary is also due to Frame. 

Corollary V.1.

Let the equation Ax = y be consistent where A is an n x m matrix
of rank s. Then the most general solution is given by

x = P_1'L_1y + (P_2' - P_1'V)z

where z is an arbitrary (m-s)-vector.

Proof

All we need prove is that every vector (I - A^gA)z_1 can be written
as (P_2' - P_1'V)z. Since A(I - A^gA) = 0, we have that C(I - A^gA) = 0.

Let us then determine the form of those matrices S such that
CS = 0. We have that C = [I_s | V]P. Hence the m x (m-s) matrix

             |   -V    |
    A_0 = P' |         | = P_2' - P_1'V
             | I_{m-s} |

is such that CA_0 = 0. We then have that the columns of A_0 must
belong to N(C). But rk A_0 = m-s = dim N(C) since
dim N(C) + rk C = m. Thus, column space A_0 = N(C). Now, since
column space (I - A^gA) ⊂ N(C) = column space A_0, there is an
(m-s)-vector z such that (I - A^gA)z_1 = A_0z = (P_2' - P_1'V)z. Q.E.D.

To help clarify the ideas of Frame’s method, we present the 
following example. 

Consider the system of equations

    2x_1 + 4x_2 +  x_3 + 4x_4 + 4x_5 = -1
     x_1 + 2x_2        + 2x_4 +  x_5 =  0
                   x_3        +  x_5 =  1
    2x_1 + 4x_2 + 2x_3 + 4x_4 + 5x_5 =  0

which we write as Ax = y:

    | 2  4  1  4  4 |   | x_1 |     | -1 |
    | 1  2  0  2  1 |   | x_2 |     |  0 |
    | 0  0  1  0  1 |   | x_3 |  =  |  1 |
    | 2  4  2  4  5 |   | x_4 |     |  0 |
                        | x_5 |

Writing [A | y | I_4] gives

    | 2  4  1  4  4 | -1 |  1  0  0  0 |
    | 1  2  0  2  1 |  0 |  0  1  0  0 |
    | 0  0  1  0  1 |  1 |  0  0  1  0 |
    | 2  4  2  4  5 |  0 |  0  0  0  1 |

Row reduction yields

    | 1  2  0  2  0 |  2 | -1  3  1  0 |
    | 0  0  1  0  0 |  3 | -1  2  2  0 |
    | 0  0  0  0  1 | -2 |  1 -2 -1  0 |
    | 0  0  0  0  0 |  0 |  1  0  1 -1 |

Hence we see that the system is consistent. 



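Here

           | 1  0  0 |
           | 0  0  0 |
    P_1' = | 0  1  0 |
           | 0  0  0 |
           | 0  0  1 |

so

        | 1  0  0  0  0 |
        | 0  0  1  0  0 |
    P = | 0  0  0  0  1 | ,
        | 0  1  0  0  0 |
        | 0  0  0  1  0 |

          | -1  3  1  0 |          | 2  2 |
    L_1 = | -1  2  2  0 | ,    V = | 0  0 |
          |  1 -2 -1  0 |          | 0  0 |

Hence

                      | -1  3  1  0 |
                      |  0  0  0  0 |
    A^r = P_1'L_1  =  | -1  2  2  0 |
                      |  0  0  0  0 |
                      |  1 -2 -1  0 |

and

           | 0  0 |
           | 1  0 |
    P_2' = | 0  0 |
           | 0  1 |
           | 0  0 |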


Hence the most general solution to Ax = y is

        | -1  3  1  0 |   | -1 |     | -2 -2 |
        |  0  0  0  0 |   |  0 |     |  1  0 |
    x = | -1  2  2  0 |   |  1 |  +  |  0  0 | z
        |  0  0  0  0 |   |  0 |     |  0  1 |
        |  1 -2 -1  0 |              |  0  0 |

where the 2-vector z is arbitrary.
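
The example can be reproduced mechanically. The following sketch is an added check, not Frame's own procedure as published: the function name `rref_with_multiplier` and the variable names are hypothetical. It row reduces [A | I_4] in exact arithmetic, reads off the distinguished columns, and assembles A^r = P_1'L_1 and P_2' - P_1'V directly from their definitions, without ever forming P.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Exact product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def rref_with_multiplier(A):
    """Return (R, L, pivots): R is the row reduced echelon form of A and LA = R."""
    n, m = len(A), len(A[0])
    M = [[F(t) for t in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    pivots, r = [], 0
    for c in range(m):
        if r == n:
            break
        k = next((i for i in range(r, n) if M[i][c] != 0), None)
        if k is None:
            continue                      # column c is not distinguished
        M[r], M[k] = M[k], M[r]
        p = M[r][c]
        M[r] = [t / p for t in M[r]]
        for i in range(n):
            f = M[i][c]
            if i != r and f != 0:
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return [row[:m] for row in M], [row[m:] for row in M], pivots

A = [[2, 4, 1, 4, 4], [1, 2, 0, 2, 1], [0, 0, 1, 0, 1], [2, 4, 2, 4, 5]]
y = [[-1], [0], [1], [0]]
n, m = len(A), len(A[0])

R, L, pivots = rref_with_multiplier(A)   # pivots = distinguished columns (0-based)

# A^r = P_1'L_1: row k of L_1 lands in the row of the k-th distinguished column.
A_r = [[F(0)] * n for _ in range(m)]
for k, c in enumerate(pivots):
    A_r[c] = L[k][:]

# P_2' - P_1'V: one column for each non-distinguished column j of A.
free = [j for j in range(m) if j not in pivots]
N = [[F(0)] * len(free) for _ in range(m)]
for q, j in enumerate(free):
    N[j][q] = F(1)
    for k, c in enumerate(pivots):
        N[c][q] = -R[k][j]

x_particular = matmul(A_r, y)            # A^r y, the solution with z = 0
```

Every solution of the system is then `x_particular` plus a combination of the columns of `N`, which is the general solution of Corollary V.1 written for this example.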



We should comment here that although we may completely analyze
the system of linear equations Ax = y through the use of g.i.'s,
this is often not the best way of doing so. There is a well-established
method for determining whether or not the system
Ax = y is consistent. This method (Ref. 17) consists of adjoining
the column vector y to the matrix A to obtain [A | y], and
row-reducing this "augmented" matrix. The system Ax = y will be
consistent if and only if rk [A | y] = rk A, and when it is consistent,
we may find a general solution as is shown in (Ref. 17). The case
where the g.i.'s are of practical use is when the system Ax = y is
inconsistent. We then work with A'Ax = A'y, and we may use Rao's
method for finding the g.i. of A'A or a g.i. of (A'A)^2. We may
then easily compute A^+.





NOTATION 



⟹                implies

⟺                is equivalent to

∈                 belongs to, or is a member of

⊂                 is a subset of

∪                 set union

∩                 set intersection

{X_i}             the collection {X_1, ..., X_n}

g.i.              generalized inverse

A, B, X, Y        capital letters denote matrices

I_n               the n x n identity matrix

x, y, z           small letters at end of alphabet
                  denote vectors

α, a, b, c        small letters at beginning of
                  alphabet or small Greek letters
                  denote real numbers

proj (y; W)       the orthogonal projection of the
                  vector y on the subspace W

(x, y)            the inner product of the vector x
                  with the vector y